Normalized k-means clustering of hyper-rectangles

Author

  • Marie Chavent
Abstract

Interval variables can be measured on very different scales. We first recall a general methodology for measuring the dispersion of a variable around an optimal center, and we define two measures of dispersion associated with two optimal "centers" for interval variables. We then study the relation between standardizing a data table and using a normalized distance in clustering. Finally, we define two normalized distances between hyper-rectangles and describe their use in two normalized k-means clustering algorithms.
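The abstract does not spell out the two distances or the two centers, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that each hyper-rectangle is stored as a vector of lower bounds and a vector of upper bounds, that the center of an interval variable is the mean interval, and that the distance is a squared Euclidean distance on the bounds divided by the per-variable dispersion. The function names `interval_dispersion`, `normalized_sq_dist`, and `normalized_kmeans` are hypothetical.

```python
import numpy as np


def interval_dispersion(lower, upper):
    """Per-variable dispersion around the mean interval (assumed measure)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dev_lo = lower - lower.mean(axis=0)
    dev_up = upper - upper.mean(axis=0)
    return (dev_lo ** 2 + dev_up ** 2).mean(axis=0)


def normalized_sq_dist(lower, upper, proto_lo, proto_up, disp):
    """(n, k) matrix of normalized squared distances to k prototypes."""
    d_lo = lower[:, None, :] - proto_lo[None, :, :]
    d_up = upper[:, None, :] - proto_up[None, :, :]
    return ((d_lo ** 2 + d_up ** 2) / disp).sum(axis=2)


def normalized_kmeans(lower, upper, k, n_iter=50, seed=0):
    """k-means on hyper-rectangles with a dispersion-normalized distance."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n = lower.shape[0]

    disp = interval_dispersion(lower, upper)
    disp = np.where(disp > 0, disp, 1.0)  # guard against constant variables

    # Initialize prototypes with k distinct observed hyper-rectangles.
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=k, replace=False)
    proto_lo, proto_up = lower[idx].copy(), upper[idx].copy()

    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: nearest prototype under the normalized distance.
        dist = normalized_sq_dist(lower, upper, proto_lo, proto_up, disp)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: the prototype is the mean interval of each cluster.
        for j in range(k):
            members = labels == j
            if members.any():
                proto_lo[j] = lower[members].mean(axis=0)
                proto_up[j] = upper[members].mean(axis=0)
    return labels, proto_lo, proto_up


if __name__ == "__main__":
    # Two interval variables on very different scales (illustrative data).
    lo = [[0, 0], [1, 10], [10, 1000], [11, 1010]]
    up = [[1, 5], [2, 15], [11, 1005], [12, 1015]]
    labels, proto_lo, proto_up = normalized_kmeans(lo, up, k=2)
    print(labels)
```

Under these assumptions, dividing each squared term by the variable's dispersion gives the same partition as first rescaling every bound by the square root of that dispersion and then running ordinary k-means on the bounds, which is one simple instance of the link between standardization and normalized distances mentioned in the abstract.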


Similar resources

Effective classification of 3D image data using partitioning methods

We propose partitioning-based methods to facilitate the classification of 3-D binary image data sets of regions of interest (ROIs) with highly non-uniform distributions. The first method is based on recursive dynamic partitioning of a 3-D volume into a number of 3-D hyper-rectangles. For each hyper-rectangle, we consider, as a potential attribute, the number of voxels (volume elements) that bel...


Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering

Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...


Mixtures of Rectangles: Interpretable Soft Clustering

To be effective, data-mining has to conclude with a succinct description of the data. To this end, we explore a clustering technique that finds dense regions in data. By constraining our model in a specific way, we are able to represent the interesting regions as an intersection of intervals. This has the advantage of being easily read and understood by humans. Specifically, we fit the data to a mixtu...


Towards a Simple Clustering Criterion Based on Minimum Length Encoding

We propose a simple and intuitive clustering evaluation criterion based on the minimum description length principle which yields a particularly simple way of describing and encoding a set of examples. The basic idea is to view a clustering as a restriction of the attribute domains, given an example's cluster membership. As a special operational case we develop the so-called rectangular uniform ...


Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K...




Publication date: 2005